Effect of clonal testing on the efficiency of genomic evaluation in forest tree breeding

Through stochastic simulations, we assessed the accuracy of breeding values and the response to selection under traditional pedigree-based (BLUP) and genomic-based (GBLUP) evaluation methods in forest tree breeding. The latter provides the methodological foundation for genomic selection. We evaluated the impact of clonal replication in progeny testing on the response to selection realized in seed orchards under variable marker density and target effective population sizes. Clonal replication in progeny trials boosted selection accuracy, thus providing additional genetic gains under BLUP. A similar trend was observed for GBLUP, but the added gains did not surpass those under BLUP. Therefore, breeding programs deploying extensive progeny testing with clonal propagation might not benefit from the deployment of genomic information. These findings could be helpful in the context of operational breeding programs.

www.nature.com/scientificreports/

genotypes) in progeny trials, enhancing the precision of forward selection 8,9. In Sweden, clonal replication in progeny testing has provided operational benefits in the Norway spruce breeding program by boosting the within-family response to selection while minimizing genetic diversity loss 9,10. Under fixed progeny test size, a trade-off exists between the family size and the number of clonal propagules per genotype (N_R), i.e., clonal size 11.
Here, building on our earlier stochastic simulations 5, we evaluated the impact of clonal replication in progeny testing on the efficiency of BLUP and GBLUP evaluation and the actual genetic response realized in seed orchards. Specifically, we assessed the combined effect of marker density, effective population size (N_e), family size, and N_R.

Methods
We used a stochastic simulation model developed in R 12. We created parental and offspring populations 5 using the function "glSim" implemented within the R package "adegenet" 13 to generate allelic frequencies in a founder population. Linkage disequilibrium (LD) was set to reflect typical values in outcrossed forest trees 6. We generated offspring populations (50 parents) using a single-pair mating (SPM) design and two family sizes (80 or 160 offspring per family) to evaluate the impact of family size, so the overall population size was 2050 or 4050 individuals, respectively.
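The population sizes quoted above follow from simple bookkeeping, sketched below. The sketch assumes that each of the 50 parents is used in exactly one cross (so single-pair mating yields 25 full-sib families) and that the reported totals count parents plus offspring. The function name is illustrative and is not taken from the authors' R code.

```python
def population_size(n_parents: int, family_size: int) -> int:
    """Parents plus offspring under single-pair mating.

    Assumption: every parent takes part in exactly one cross.
    """
    if n_parents % 2 != 0:
        raise ValueError("single-pair mating needs an even number of parents")
    n_families = n_parents // 2  # 50 parents -> 25 full-sib families
    return n_parents + n_families * family_size


sizes = {fs: population_size(50, fs) for fs in (80, 160)}
print(sizes)  # {80: 2050, 160: 4050}
```

Under these assumptions the two family sizes reproduce the totals of 2050 and 4050 individuals.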
Bi-allelic marker data were simulated for the full-sib families using the function "genomesim" implemented within the R package "pedantics" 14. We set the number of markers (SNPs) per centiMorgan (cM) to 1, 5, or 10, covering chromosome lengths of 120 cM. In total, two chromosomes (linkage groups) were simulated, comparable to previous studies 3, giving a maximum of 2400 markers. As the impact of the traits' genetic architecture was evaluated earlier 5, we modeled only a fixed number of QTL (N_QTL = 200). QTL effects were randomly assigned to selected loci and sampled from a standard normal distribution to emulate polygenic traits. We generated phenotypic data as the sum of allelic effects across all QTL with the addition of residual effects reflecting h^2 = 0.2, which approximates growth traits in forest tree species. Phenotypes of clonally replicated genotypes were derived as the sum of the genotypic value and the average of N_R independent samples of residual effects. We conducted 200 independent stochastic iterations of the above scenarios.
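The phenotype model described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R implementation: the toy genotypes are drawn at independent loci with allele frequency 0.5 (the study simulated LD and full-sib pedigree structure instead), and all function names are hypothetical. The helper clone_mean_heritability is a derived quantity rather than one reported in the text. It follows from averaging N_R independent residuals, which divides the residual variance by N_R.

```python
import random
import statistics


def clone_mean_heritability(h2: float, n_ramets: int) -> float:
    """Heritability of a clone mean when residuals are averaged over n_ramets copies."""
    return h2 / (h2 + (1.0 - h2) / n_ramets)


def simulate_phenotypes(genotypes, qtl_effects, h2, n_ramets, rng):
    """Return (true genetic values, clone-mean phenotypes).

    genotypes: one list of allele counts (0/1/2) per individual, one entry per QTL.
    """
    g = [sum(a * e for a, e in zip(ind, qtl_effects)) for ind in genotypes]
    var_g = statistics.pvariance(g)
    sd_e = (var_g * (1.0 - h2) / h2) ** 0.5  # residual SD that gives the target h2
    y = []
    for gi in g:
        residuals = [rng.gauss(0.0, sd_e) for _ in range(n_ramets)]
        y.append(gi + statistics.fmean(residuals))  # genotypic value + mean residual
    return g, y


rng = random.Random(1)
n_ind, n_qtl = 500, 200
effects = [rng.gauss(0.0, 1.0) for _ in range(n_qtl)]  # standard normal QTL effects
# Toy genotypes (assumption): independent loci, allele frequency 0.5
geno = [[rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_qtl)]
        for _ in range(n_ind)]
g, y = simulate_phenotypes(geno, effects, h2=0.2, n_ramets=6, rng=rng)

for n_r in (1, 6, 12):
    print(n_r, round(clone_mean_heritability(0.2, n_r), 2))
```

With h^2 = 0.2, the clone-mean heritability is about 0.2, 0.6, and 0.75 for N_R = 1, 6, and 12, so most of the added precision is obtained with the first few ramets.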
We conducted separate genetic evaluations in ASReml software V.3 15 for pedigree-based (BLUP) and genomic-based (GBLUP) relationships to predict offspring breeding values (BV; i.e., forward selection) using the animal model in the REML framework 16. The marker-based relationship matrix G was constructed as follows 17:

$$G = \frac{ZZ'}{2\sum_{i} p_i (1 - p_i)}$$

where Z = M − P, M is the marker matrix containing genotypes coded as 0, 1, and 2 for the first-allele homozygote, heterozygote, and second-allele homozygote, respectively, P contains the doubled frequencies of the second allele (2p_i for locus i), and p_i is the frequency of the second allele at locus i.
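As an illustration of how a marker-based relationship matrix can be computed from the quantities defined above (Z = M − P, doubled allele frequencies, genotypes coded 0/1/2), the sketch below divides ZZ' by 2 Σ p_i(1 − p_i), a widely used scaling for this kind of matrix. Two points are assumptions of the sketch: the allele frequencies are estimated from the genotyped sample itself, and the scaling is the one named here rather than one confirmed against the authors' code. The function name vanraden_g is hypothetical.

```python
def vanraden_g(markers):
    """Genomic relationship matrix from an n x m list of genotypes coded 0/1/2."""
    n, m = len(markers), len(markers[0])
    # p[j]: frequency of the second allele at locus j (estimated from the sample)
    p = [sum(row[j] for row in markers) / (2.0 * n) for j in range(m)]
    # Z = M - P, where P holds the doubled allele frequencies 2 * p[j]
    z = [[row[j] - 2.0 * p[j] for j in range(m)] for row in markers]
    scale = 2.0 * sum(pj * (1.0 - pj) for pj in p)
    if scale == 0.0:
        raise ValueError("all loci are monomorphic; G is undefined")
    # G = Z Z' / scale
    return [[sum(a * b for a, b in zip(zi, zk)) / scale for zk in z] for zi in z]


# Four individuals, four SNPs (toy data)
M = [
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
    [1, 2, 1, 0],
]
G = vanraden_g(M)
for row in G:
    print([round(v, 3) for v in row])
```

Because the frequencies are taken from the same sample, each column of Z is centered and every row of G sums to zero. An operational analysis might instead use base-population frequencies.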
Breeding value (BV) accuracy was calculated as the correlation between predicted values (from the BLUP and GBLUP evaluations) and true values (the sum of simulated allelic effects). We report the standard error of the mean accuracy across the 200 iterations, calculated as the standard deviation divided by the square root of the iteration count. Following the genetic evaluation, a set of unrelated offspring with top breeding values (considered as parents in seed orchards) was chosen by mathematical programming 18 to meet the predetermined effective population size (N_e = 5, 10, 20, and 25), thus maximizing the genetic response 19. The method selects the set of offspring that maximizes the average additive genetic value (genetic gain) while meeting the declared effective population size (constraint). Relatedness among the selected trees was not permitted, to avoid inbreeding in the seed orchard crop. Optimization was conducted in Gurobi software 18. Details on the optimization algorithm are provided in Lstibůrek and Hodge 19.
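The accuracy and standard-error calculations described above can be sketched as follows. The "predicted" values here are stand-ins (true values plus random noise) used only to exercise the functions; they are not BLUP or GBLUP predictions. The use of the sample standard deviation (n − 1 denominator) is an assumption, as the text does not specify it.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_and_se(values):
    """Mean and standard error (sample SD divided by sqrt of the count)."""
    return statistics.fmean(values), statistics.stdev(values) / math.sqrt(len(values))


rng = random.Random(42)
accuracies = []
for _ in range(200):  # 200 iterations, as in the study
    true_bv = [rng.gauss(0.0, 1.0) for _ in range(100)]
    # Stand-in predictions (assumption): true value plus unit-variance noise
    pred_bv = [t + rng.gauss(0.0, 1.0) for t in true_bv]
    accuracies.append(pearson(pred_bv, true_bv))

mean_acc, se_acc = mean_and_se(accuracies)
print(f"accuracy = {mean_acc:.3f} +/- {se_acc:.4f}")
```

With equal signal and noise variances, the expected correlation is close to 1/sqrt(2), which gives a quick sanity check on the stand-in data.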

Results
Accuracy of predicted breeding values. Table 1 provides BV accuracies at variable family size, N_R, and marker density. Under GBLUP, BV accuracy increased steadily with higher marker density (mainly between 1 and 5 SNPs/cM). BV accuracy under GBLUP was greater than under BLUP in all investigated scenarios once marker density reached 5 SNPs/cM. However, at 1 SNP/cM, BV accuracies of GBLUP were inferior to those of BLUP irrespective of N_R and family size.
Clonal replication boosted accuracies of both BLUP and GBLUP evaluations across all marker densities and family sizes. The difference is visible primarily between one and six clonal copies, while additional clonal replication (up to 12) provided smaller increments. At low marker density (1 SNP/cM), the accuracy of GBLUP ranged

Table 1. Accuracy of breeding values as a function of family size (80 and 160), clonal size N_R, and marker density (SNPs/cM). N_R = 1 means no cloning (1 ramet per clone).

Selection response. Table 2 presents differences in the standardized response to selection between GBLUP and BLUP; relative differences are provided in Table S2. The impact of added marker density was greatest between 1 and 5 SNPs/cM and diminished toward 10 SNPs/cM, primarily under family size 80. Under clonal replication (N_R = 6-12), GBLUP yielded a minor benefit in selection response at 5-10 SNPs/cM, but the difference was not statistically significant (alpha = 0.05). The impact of N_R on the absolute selection response of both methods was prominent irrespective of marker density, primarily between 1 and 6 clonal replicates (yet additional gain was generated under N_R = 12). While clonal replication improved the gains of both evaluation methods, the larger boost in selection response was observed under BLUP. Under the lowest marker density (1 SNP/cM), BLUP generated a higher selection response, by approximately 0.4-1.3 standard deviations. Under N_R = 1 (no cloning), GBLUP was superior to BLUP, primarily under higher marker densities and larger families (see Table 2, N_R = 1). These trends generally held across the range of N_e, yet the difference between the two methods was diluted at larger N_R. Under N_R = 1 and moderate marker density (5 SNPs/cM), the larger family size (160) increased the difference between the two methods. Note that the values in Table 2 are differences between the standardized genetic gains of GBLUP and BLUP (GBLUP minus BLUP). Thus, they do not reflect baseline genetic gains, e.g., the substantial drop in selection response with increasing N_e. Under N_R = 1, low marker density (1 SNP/cM), and N_e = 5, the absolute difference of −0.52 (Table 2) reflects standardized gains of 0.88 (GBLUP) and 1.4 (BLUP). Assuming the same parameters but N_e = 25, the absolute difference of −0.38 reflects a substantial drop in standardized gains due to lower selection intensity, i.e., 0.13 (GBLUP) and 0.51 (BLUP).
For clarity, standardized genetic gains of all strategies are provided in Table S1. Relative genetic gains, i.e., ratios of the standardized genetic response of GBLUP/BLUP, are provided in Table S2.
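The relationship between the absolute differences (Table 2) and the relative gains (Table S2) can be checked with the two worked cases quoted in the text (N_R = 1, 1 SNP/cM). The sketch below uses only those four rounded gains. The ratios it prints are therefore approximations computed from rounded inputs and need not match Table S2 exactly. The function name compare is illustrative.

```python
def compare(gblup_gain: float, blup_gain: float):
    """Return (absolute difference, relative gain) of GBLUP with respect to BLUP."""
    difference = gblup_gain - blup_gain  # quantity reported in Table 2
    ratio = gblup_gain / blup_gain       # quantity described for Table S2
    return difference, ratio


# Standardized gains quoted in the text: N_e -> (GBLUP, BLUP)
quoted_gains = {5: (0.88, 1.40), 25: (0.13, 0.51)}

for n_e, (gblup, blup) in quoted_gains.items():
    diff, ratio = compare(gblup, blup)
    print(f"N_e={n_e}: difference={diff:.2f}, ratio={ratio:.2f}")
# N_e=5: difference=-0.52, ratio=0.63
# N_e=25: difference=-0.38, ratio=0.25
```

The absolute difference shrinks from −0.52 to −0.38 as N_e grows, while the ratio drops from about 0.63 to about 0.25, which is why the two tables are best read together.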

Discussion
Here, we estimated the relative efficiency of the genomic evaluation protocol over the traditional phenotypic alternative. Our findings resemble those of animal and plant breeding studies, i.e., added prediction accuracy and anticipated selection response under genomic evaluation. This relative superiority of GBLUP is conditional on dense marker coverage, low narrow-sense heritability, and the presence of population-wide linkage disequilibrium 1,3,5,20,21. In operational tree breeding programs, additional factors contribute to breeding efficiency, e.g., sizes of breeding and production populations, mating design, progeny test size and configuration, maximum acceptable inbreeding rate, extent of genotype-by-environment interactions, and cost and time parameters of breeding activities (see 7 for an introduction to forest tree genetics and breeding).
The novelty of our comparison lies in the inclusion of clonal replication in progeny test trials, as used in operational tree breeding programs to boost selection accuracy. The main added value of the GBLUP evaluation is its ability to capture within-family additive genetic variance and to unmask cryptic relatedness 22,23. Under all investigated scenarios, the relative genetic gain efficiency of GBLUP decreased with added clonal propagules per offspring individual (N_R). Under 1 SNP/cM, BLUP provided a significantly higher genetic response than GBLUP across the whole range of N_e and family sizes. Both evaluations yielded comparable genetic gains with denser marker coverage (5-10 SNPs/cM); differences were not significant (alpha = 0.05). This finding implies that combining cloning and genomic evaluation does not bring added genetic response. Thus, our results could inspire breeders to consider two broader alternatives: investing resources into clonal test trials, or into genomic evaluation. In agreement with previous studies 8,9, a relatively low N_R (6) provided sufficient accuracy. While both strategies benefited in prediction accuracy from the added N_R, their relative efficiency was equalized by applying the diversity constraint (N_e) in selection. This is a clear message to operational forest tree breeding programs. Without cloning, the superiority of GBLUP is limited to low h^2, large family sizes, and higher marker coverage (5-10 SNPs/cM).

Table 2. Differences in standardized genetic gains between GBLUP and BLUP for combinations of N_R (1, 6, 12), N_e (5, 10, 20, 25), marker density (1, 5, 10 SNPs/cM), and family size 80 (top table) and 160 (bottom table). An asterisk indicates significant differences (alpha = 0.05). N_R = 1 means no cloning (1 ramet per clone).
Genetic response in smaller families is limited by model oversaturation under dense marker coverage 23. Conversely, large family sizes (160 offspring per cross) are impractical in many species, even though they are not prone to oversaturation and allow higher selection intensity (a larger number of selection candidates). In scenarios with clonal replication, GBLUP provided no additional benefit over BLUP across the range of diversity levels in production populations (seed orchards). There are practical scenarios under which GBLUP could become economically more feasible. These include programs in which clonal propagation technology is too costly or unavailable. Similarly, SNP genotyping platforms have been developed and are currently operationally feasible in only a limited number of forest tree species. As forest trees are long-lived perennials with generation intervals spanning decades, the principal added benefit of genomic selection is the shortening of the breeding cycle. Therefore, GBLUP is becoming a viable platform in this context. Our conclusions are relevant to full-scale operational tree breeding programs that capture general combining ability with repeated cycles of controlled crosses (single-pair mating, progeny testing, and selection). Adaptive genetic response in natural populations could, in theory, be enabled by GBLUP evaluation based on an SNP chip platform. However, this is limited by the magnitude of genetic covariance, i.e., the product of genetic coancestry and the respective genetic variance in natural populations. Future research could investigate, by stochastic simulation, the genomic-based single-step model (HBLUP) augmented by clonal replication in progeny testing.

Data availability
In our study, we described a stochastic simulation model and compared hypothetical breeding strategies. No real-world data of any species have been used throughout the study. However, output data have been outlined and published in tables and figures included in the manuscript. The complete R code was submitted as a compressed folder within supplements (S3 file).